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Abstract 

In this paper we study the problem of predictability in partially observable discrete event systems, 
i.e., the question whether an observer can predict the occurrence of a fault. We extend the definition 
of predictability to consider the time interval where the fault will occur: the (i, .^-predictability does 
not only specify that the fault will be predicted before it occurs, but also that the predictor will be able 
to predict that its occurrence will occur in i to j observations from now. We also provide a quadratic 
algorithm that decides predictability of the system. 
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1 Motivation 

A fault is predictable if its unavoidable occurrence can always be determined in advance. Being able to 
predict the fault allows the supervisor to step in and take preventive actions, such as reconfiguring the 
system, replacing damaged components, or shutting the system down. 

Predictability has been greatly studied in the last decade (some references are provided in the related 
work section). To be maximally effective, the prediction should satisfy two criteria: it should be made well 
in advance, so that the operator has enough time to decide for and perform corrective actions; it should be 
reasonably precise, so that the repair is not performed too early if that is unnecessary. The first contribution 
of this paper is the formalisation of these two objectives: we define the notion of (*, .^-predictability, a 
generalisation of the existing notion of predictability that states that faults can always be predicted at least 

1 timesteps in advance and, when this prediction is made, the fault will not occur in more than j timesteps. 

We study this definition of predictability and we propose an algorithm that computes all pairs (i,j) for 
which predictability holds. We show that this algorithm runs in quadratic time. This is an improvement 
over the existing predictability algorithms that run in 0(n 4 ). 

This paper is organised as follows. Next section presents preliminary definitions. Our definition of 
predictability is presented in Section 3, together with a discussion of its benefits. Our algorithm is given in 
Section 4. Existing approaches are discussed in Section 5. 

2 Preliminaries 

2.1 Discrete Event Systems 

This work is applicable to finite discrete event systems (DES) [CL99]. The system is modeled as a DES and 
is assumed fixed for this paper. A (finite) DES is a model for dynamic systems where the state space is 
discrete (and finite) and is modeled as a finite state machine. 

A (partially observable) finite state machine (FSM) is a tuple A = (Q, E, T, qi, E 0 ) where Q is a finite 
set of states, E is a finite set of events, TCQxExQisa finite set of transitions, q\ £ Q is the initial state, 
and E 0 C E is a finite set of observable events. 
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To simplify notations, it is assumed that the FSM is deterministic, i.e., there is only one initial state and 
there are no two transitions originating from the same state and labeled with the same event: 

{(?) e, q[), (q, e, q' 2 )} C T => q[ = q' 2 . 

This assumption is not restrictive as any non-deterministic FSM can be turned into a deterministic FSM 
that is equivalent from a predictive/monitoring perspective, by adding a number of states and transitions 
smaller than the original number of transitions and without affecting the overall complexity of the algorithm. 
Furthermore the algorithms presented later apply to non-deterministic FSM as well. The assumption of 
determinism is however convenient because there a one-to-one mapping between a path and a trace (defined 
below). 

A path p is a double sequence of states and events qo A ... A qk such that Vi E {1,..., k}, {qi-i, e,, qi) £ 
T. The label u, called the trace, of the path is the sequence of events u = e i... e^. That there exists a path 
labeled by u from qo to qk is denoted <?o —> qk] the state qk reached from q through u is denoted q —> and 
the fact that it exists is written (q A) £ Q. 

The definition of a path is extended to infinite paths qo A q± A ... such that for all i > 0, qo A ... -A qi 
is a path. It is assumed that the system is live, i.e., that for any state q £ Q, there exists an outgoing 
transition: Mq £ Q, 3e £ E. 3 q' £ Q. (q, e, q') £ T. Infinite traces are denoted w and finite ones u. The 
prefix relation is denoted u Q v where v may be finite or infinite. We extend the notation (q A) £ Q to 
infinite traces, with the meaning Vu C w. (q A) £ Q. 

The system starts in state qo = qi and takes an infinite path. The language C = {w £ E“ | (qi A) £ Q} 
is defined as the set of infinite words over E that label an infinite path on the FSM starting from the initial 
state. 

Given a finite word u £ E*, the observation of u is the traditional projection of u on the set of observable 
events: 

i e if u = e, 

obs(it') if u — eu' and e £ E \ E 0 , 

e obs(u') if u = eu' and e £ E 0 

where e is the empty sequence. As usual it is assumed that any infinite trace generates infinitely many 
observations. 

2.2 Faults 

The system can be subject to faults, i.e., types of behaviour that we wish to prevent. Faults can be defined 
as a single event or as a subtle pattern of events [JMPC06]. These two definitions are however very similar: 
the important notion here is that it can also be modeled as the property of the current (possibly augmented) 
state of the system (normal state vs. faulty state). A set F C Q of states will represent the faulty states: 
a path is faulty if it reaches a faulty state (3 i. q t £ F). The faulty aspect of a trace u will therefore be 
represented by (qi A) £ F. Notice that, by definition, any transition from a faulty state leads to a faulty 
state: 

{q, e, q') £ T A q £ F => q' £ F. 

It is assumed that the initial state is not faulty. The set of infinite faulty traces is represented by language 
Cf C C, which is formally defined as the set of traces whose path from <71 is faulty. 

3 (i, j)-Predictability 

3.1 Predictability 

Fault prediction is the problem of deciding whether an operator should be warned that a fault is bound to 
occur. We want to give guarantees about the prediction of the fault. This guarantee is expressed by a tuple 
(i,j) where i (resp. j) is a lower bound (resp. upper bound) of the fault occurrence. 
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In the following a time interval is a pair of elements (x, y) from N U {00} (the natural numbers including 
zero and infinity) so that x < y. We define the operator 0 so that (x,y)Q 1 = (x 0 1 , y © 1 ) where lQ\= i 
if l G { 0 , 00} and f 0 1 = l — 1 otherwise. A time interval (a:, y) can be interpreted as the set of numbers 
between x and y. Under this interpretation the relation (x,y) C (x',y') is equivalent to x' < x < y < y'\ 
and (x,y) U ( x',y' ) = (min(:r, a/), max(t/, y')). 1 

A predictor is a machine P that, given a sequence o of observations, returns a time interval (x, y) = P(o), 
meaning that any trace that matches this sequence will not become faulty before x more observations 
are collected (if x = 0, the fault may already have occurred) but will definitely be faulty before y more 
observations are (or returns y = 00 if the fault is not predicted—it may never occur). In the coming 
definition, notice that, while this is not explicitely stated, if u and u' are two different traces that generate 
the same observations (obs(u) = obs(it')) then the predictor should obviously give the same prediction: 
P(obs(u)) = P(obs(w')). Hence the predictor has to be conservative so as to satisfy the two constraints 
given in the definition for all relevant traces. In other words, there are two types of uncertainty: uncertainty 
about what happened until now (we only know that the behaviour generated the sequence o but the actual 
behaviour is unknown); uncertainty about what will happen from now. 

Definition 1 A predictor is a machine P that takes a sequence of observations and that returns a time 
interval with the following property: Vw 6 C. Vui,U2 such that u\ E w 2 E w, let ( x,y ) = P(obs(ui)), then 

• |obs(u2)| — |obs(«i)| < x => (<71 -f) £ F and 

• |obs(w 2 )| - |obs(ui)| > y => {q\ ^) G F. 

An (i,j)-predictor has the added requirement that, before a fault occurs, a prediction should be made 
about the fault occurrence that is tighter than, or as tight as, (i,j). 

Definition 2 A predictor P is an (i,j)-predictor for a given trace w G Cp if 

3 u E w. P(obs(u)) C (i,j). 

A predictor is an (i, j)-predictor if it is an (i, j)-predictor for every trace w G Cf- 

('«, j)-predictability is then the property that an (z,j)-predictor exists. We also define i-predictability, the 
property that the fault occurrence can be predicted at least i observations before it occurs; and predictability , 
the property that the fault can be predicted before it occurs. 

Definition 3 A system is (i, j)-predictable if there exists an (i,j)-predictor for it. It is ©predictable if it 
is (i, j) -predictable for some j G N. It is predictable if it is i-predictable for some iGN \ { 0 }. 

Notice that the condition j G N (i.e., j ^ 00) is necessary because forbidding the upper bound of P(o) 
to be 00 forces the predictor to predict the fault before its occurrence (i.e., the predictor asserts that the 
fault will definitely occur). Similarly we forbid i = 0 because we want the fault to be predicted in a state 
where it has not occurred yet. 


Observation pattern 

Prediction 

No d 

[2,oo] 

Last observed event is d 

[1,2] 

Second last observed event is d 

[0,1] 

Contains d followed by two or more observed events 

[0,0] 


Table 1 : A ( 1 , 2 )-predictor for the system of Figure 1 . 

1 Nol ice that (x. y) U (x',y') may contain elements that are neither in (x. yj nor in (xb y f ). 


3 



CCD 


CCD 



Figure 1: Example of a system; t is the only unobservable event. 

These definitions are illustrated with the example of Figure 1. The faulty states are represented with 
grey filling. Table 1 presents one predictor. For instance the first pattern of the predictor specifies that if the 
sequence of observations does not contain the event c then the prediction is ( 2 , oo), i.e., there will be at least 
two observations before the fault occurs, and it may never occur. The second pattern specifies that if the 
last event of the sequence of observations is d then the prediction is ( 1 , 2 ), meaning that a faulty state will 
be reached after one or two more observations are received. Similarly for the third pattern: the prediction 
is (0,1), i.e., it may already have occurred or it will when the next observation has been received. Finally 
the last pattern indicates a situation where the fault definitely occurred. 

We illustrate that the machine in Table 1 (denoted P here) indeed presents a predictor on a few selected 
examples. We first assume a trace u\ = aba with prediction P(obs(iti)) = (2, oo). Consider its continuation 
U2 = u\b\ then the length difference between obs(rt 2 ) and obs(ui) is 1 , which is less than 2 ; therefore U2 
has to satisfy (qi -$) ^ F. which it does. Consider instead 112 = U\dc-, the length difference is this time 2, 
which means that none of the constraints in Definition 1 applies. Predictor P is not claimed to be “optimal” 
(where the precise definition of optimality is presented later); nevertheless one might claim that a prediction 
of ( 2 , 00 ) is not very precise given that any continuation of u\ requires three observable events to reach a 
faulty state ( dca is the shortest). Notice however that P does not know that the system trace is U\ : it only 
knows the sequence of observations generated by u\, i.e., aba , which is identical to the sequence generated 
by u\ = abta; this trace u[ can reach a faulty state in just two observable steps (da), which forces the lower 
bound of P(obs(rti)) to be at most 2. 

Assume now u\ = abad with prediction (1,2). Consider the non-faulty trace 112 = u\c\ the length 
difference is 1, which means that none of the constraints in Definition 1 applies. Consider instead the faulty 
trace U 2 = U\ca; the length difference is 2, which is greater or equals to the upper bound of the prediction; 
therefore 112 has to satisfy (<71 ^1) e F, which it does. 

As we can see any faulty trace has to include d, which means that the flow of observations generated 
by a faulty trace will eventually be associated with the prediction (1,2). Therefore the system is (1,2)- 
predictable. We can however show that the system is not (2,2)-predictable. Indeed consider the infinite faulty 
trace w = adca “ where the exponent “ indicates an infinite repetition of a. For w to be (2, 2)-predictable, 
we need to exhibit one of its prefix u\ such that one can predict P'(obs(wi)) C (2,2) (here P'(obs(wi)) 
should exactly equal (2,2)). Assume that such a prefix and such a predictor exist. Following Definition 1, 
consider a continuation 112 of u\ that generates one more observation; because |obs(i( 2 )| — |obs(wi)| = 1 , 
U 2 should not lead to a faulty state. Therefore u\ has to belong to the set {e,a,ad}. Similarly however, if 
U 2 is chosen such that its observable length is exactly two more than that of u\, then U 2 has to lead to a 
faulty state. Therefore u\ = ad and P'(obs(wi)) = P'(ad) = (2,2). Consider however the trace u[ = tad 
and its continuation u 2 = u[a. Clearly P'(obs(u' 1 )) = P'(ad) = (2,2). According to Definition 1 since 
|obs(t4)| — |°bs(u / 1 )| = 1 < 2 u' 2 should not lead to a faulty state. It does however, which shows that no 
prefix u\ of w satisfies P'(u\) C (2, 2) for some predictor P'. 
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3.2 Discussion 


Predictors can be used to stop or rectify the system before it produces a faulty behaviour. Being able to 
predict a fault well in advance helps getting prepared for intervention; this is represented by the i parameter 
(which should be maximised). Being able to predict the time when the fault is likely to happen prevents hasty 
corrections; this is represented by the difference (j — i) (which should be minimised). There is an implicit 
assumption here that the number of observations is indicative of time: for instance the system generates one 
observation per minute. This is particularly relevant to hybrid systems modeled as DES [VTPS15]. 

Ideally the system should be (i, j)-predictable with a large i value and a small (j — i) value. 

We illustrate the definition of predictability by considering the example of the potentially critical sub¬ 
system of an aircraft. This example is, of course, very limited. For such a system it is important to predict 
faults well in advance in order to take preventive measures (e.g., modify the flight path in order to stay 
near to an aerodrome). On the other hand it is also important to provide a precise prediction as emergency 
landings are expansive. 

In order to provide an early prediction we might want the system to be at least 30-predictable. At that 
stage however, we do not need a precise prediction: a (30,10 000+)-predictability is still acceptable. For the 
second requirement however, we want to be able to predict the fault quite accurately, for instance (15, 240)- 
predictability which suggests that the fault will occur in the next four hours and that an unscheduled landing 
is now necessary. So, interestingly, this example requires two different predictability properties. 


4 Solving Interval Predictability Problems 

This section shows how to verify the predictive level of a given system. 

4.1 Predictive levels 

We first show that, while the definition of predictability involves two parameters, the dimension of pre¬ 
dictability is actually much smaller. 

Lemma 1 A system that is (i, j)-predictable is also 

1. (b (J + 1 ))-predictable (if j ^ oo) and 

2. ((i — 1), (j © 1))-predictable (if i >2). 

Proof That (i, ^-predictability entails (i, j + l)-predictability is trivial from Definition 2: an (i , j)-predictor 
is also an ( i,j + l)-predictor since the constraint on the prediction is strictly weaker. 

Assume that the system is (i, j)-predictable with i > 2, i.e., there exists an (i, j)-predictor P. Then 
define P' such that 

• P'(e) = P(e) and 

• P'{oe) = P(o) 0 1. 

It is easy to show that P' is a predictor (if the prediction P{o) was correct, then the prediction P'{oe) is 
correct). Furthermore it is easy to prove that P' is an (i — 1), ( j — l)-predictor: if P(obs(u)) C (j, j) for some 
prefix u of w, then for the prefix tie, P'( tie) = P(u) 0 1 C (i — 1, j — 1) (or (i — 1, oo) if j = oo). □ 

Lemma 1 shows that some levels of predictability are strictly weaker than others. There are however levels 
of predictability that are mutually incomparable. Consider the examples of Figure 2. Clearly the system of 
Figure 2a is (1, l)-predictable because a fault is always preceded by two os and the occurrence of the first a 
implies that the fault will be reached after the next observation; on the other hand it is not ( 2 ,3)-predictable 
because when the fault becomes unavoidable (i.e., it will occur after less than 3 observations) then the fault 
can (and, actually, will) occur after less than 2 observations. The system of Figure 2b is (2,3)-predictable 
because the fault is always preceded by aaa or aabb and because observing a first a implies that the fault is 
unavoidable; on the other hand, it is not ( 1 , l)-predictable because, after observing aa, it is not possible to 
decide whether the fault will occur immediately or after two observations. 
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a. Example of a (1, l)-predictable set of 
faults which is not (2, 3)-predictable. 



predictable) set of faults which is not (1,1)- 
predictable. 


Figure 2: Illustrating that some predictive levels are not comparable. 


4.2 Characterisation of Predictability 

In order to determine whether a system is predictable we define notions of distance between a system state 
and a fault. 

Definition 4 The minimal distance between q and the set F of states denoted dminp(q), is the minimum 
number of observations before reaching F from q 

dminF^q ) = min |obs(u)| 

(<A) £F 

and oo if there is no such u. The maximal distance between q and a set of states F, denoted dmaxp(q), is 
the maximum number of observations before reaching F from q 

dmaxF(q) = max |obs(it)| + l, 

(-/A) &Q\F 

oo if there is no bound to |obs(u)|, and 0 if q £ F (i.e., there is no such u). 

Notice that these distances are bounded by the number |Q| of states when they are different from oo. 
Indeed, if dminp(q) > |Q| the corresponding trace includes a cycle, and a smaller trace therefore exists (by 
cutting the cycle). Similarly if dmaxp(q) > \Q\ the corresponding trace includes a cycle, and a longer trace 
exists (where the cycle can be taken once more). 

The minimal and maximal distances give us a first estimate of the time interval before fault. To simplify 
notations we write distancesp(q) to denote the time interval ( dminp(q ), dmaxp(q)). 

Lemma 2 For all trace w £ C and all prefix u E w, if P is a predictor then 

distancesp(qi A) C P(obs(u)). 

Proof Let (■ i,j ) = distancesp(qi A) be the time interval of the state ( q\ A) and let (x,y) = P(obs(u)) be 
the prediction of obs(w). 

By definition of the minimal distance i, there exists a trace u' such that u E u! E w, |obs(id) | — |obs(u) | = i, 

and (qi A) £ F. Therefore i < x would contradict the first condition in the definition of a predictor (Def. 1). 
Furthermore if j > 0, then by definition of the maximal distance j, there exists a trace u' such that 

u E u! E w, |obs(u')| — |obs(u)| = j — 1, and (q\ A) £ Q\F. Therefore j — 1 > y would contradict the 
second condition in the definition of a predictor (Def. 1). 

If j = 0 then i = 0 and y > x > i = j implies y > j. □ 

This result can be generalised to the collection of states that an observer can assume the system to be in 
(the belief state). Formally the belief state B(o) is the set of states that the system can be in if the sequence 
o of observations has been observed: 

B(o) = {q £ Q | 3 u. qi A q A obs(u) = o}. 
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Corollary 3 For all predictor P, for all sequence o of observations 


distancesp(q) C P(o). 

qeB(o) ) 


Proof If q £ B(o) is an element of the belief state then, by definition of the belief state, there exists a trace 
u such that qj -4 q and o = obs(u). From Lemma 2, distancesF{q) E P(o), which also applies to the union 
of these elements. □ 

Actually it is possible to characterise the “optimal” predictor in terms of distances. Let P and P' be 
two predictors. We say that P is stronger than P', denoted P >z P' , iff P(o) C P'(o) for all o. 2 We denote 
P* the optimal predictor: P* = max^{P | P is a predictor}. It should be clear that the optimal predictor 
is well-defined and unique. 


Lemma 4 The optimal predictor P* is exactly the predictor that satisfies P*(o) = 
for all sequence o of observations. 


(u 


qe8(o) 


distances 



Proof Let (i,j) = (U g eB( 0 ) distances p(q)J and (x,y) = P*(o). From Corollary 3 we already know that 

(i,j) C (x,y). We only need to prove that P(o) = ( i,j ) is a correct prediction. 

Following Definition 1 let w £ C be an infinite trace and let iti,it 2 be two finite traces such that u\ E 
U2 E w and obs(iti) = o. Let us call q\ the state reached by u± and q2 the state reached by U2 : qi -4 <le- By 
definition of the belief state, q\ £ B{o). To prove that P(o ) is a correct prediction we need to prove that the 
two conditions of Definition 1 are satisfied. 

Assume that <72 ^ P; we shall prove that the premise of the second condition in Definition 1 is not 
satisfied. By definition of the maximal distance of q\: dmaxp(qi) > |obs(i( 2 )| — |obs(«i)|. Since we know 
j > dmaxF(qi), it clearly holds that |obs(u 2 )| — |obs(ui)| < j. 

Assume instead that q 2 £ F; we shall prove this time that the premise of the first condition is not 
satisfied. By definition of the minimal distance of q\: dminF^qi) < |obs(w 2 )| — |obs(iti)|. Since we know 
i < dminF{qi), it clearly holds that |obs(u 2 )| — |obs(ui)| > i. □ 

As it turns out P*(o) equals the union of exactly two intervals. 


Lemma 5 For all sequence o of observations such that B(o) ^ 0, there exists a pair of states {q±, < 72 } E B(o) 
such that P*(o) = distances f( qi) U distances f(? 2 )- 


Proof From Lemma 4 P*(o) is the union of a finite collection of intervals. Because this set is finite, there is an 
interval, say distances f( qi), whose lower bound is minimal; similarly there is an interval, say distancesp ( 92 )> 
whose upper bound is maximal. Therefore P*(o) = distances f(^i) U distancesp ( 92 ) ■ □ 

The optimal predictor exhibits some very interesting properties. 


Lemma 6 For all sequence o of observations, 

P*(oe) C P*(o) 0 1. 

Proof Let u\ C. u 2 be two finite traces such that |obs(u 2 )| = |obs(wi)| + 1. Then by definition dminp{qi ^4- 
) > dminF^qi -4) + 1 (unless dminpiqi -4) = 0). Similarly dmaxp{qi -4) < dmaxp{qi -4) + 1 (unless 
dmaXFiqi -4) = 00 ). 

Therefore distances f (qi -4) E distancesp (qi -4) © 1. 

For each state q 2 £ B(oe), there exists a state in <71 £ B(o) such that two such traces u\ E ^2 lead 
respectivement to q\ and q 2 (but notice that for some q±, there may be no such < 72 )- Therefore P*(oe) = 
U q2 £B(oe) distances F {q 2 ) E U, lG g(oe) distances F (qi) © 1 = P*(o) © 1. □ 

2 We assume that P(o) and P'(o) are undefined if o cannot be generated by the system (B(o) = 0). 
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The optimal predictor can be used to decide predictability. Indeed from Definition 2 any suboptimal 
predictor enjoys only a (non-necessarily strict) subset of (i, ^-predictability qualities of the optimal predictor. 
This is expressed in the following corollary where non-predictability is proved if (i,j) is a strict subset (c) 
of some prediction P*(o). 

Corollary 7 If dminF^qi) > i the system is not (i, j)-predictable iff there exists a sequence o of observations 
such that ( i,j ) C P*(o). 

Proof We assume dminF(qi) > i- 

4= Assume that there is no sequence o of observations such that (i,j) C P*(o). Consider a faulty trace 
w. We shall show that w is (i, j)-predictable. 

Let u Q w be a faulty prefix: ( q\ A) £ F. Then by Definition 1 of a predictor, P*(obs(u)) = (x,y) 
where x = 0. Notice also that P*(obs(£)) = (x e ,y e ) where x e > i. From Lemma 6 we know that adding one 
observation to a sequence can reduce the lower bound of the interval returned by P* only by 1. Therefore, 
since the lower bound is greater than or equal to i for e and down to 0 for obs(w), there is a prefix u' of u 
such that P*(u') = {x',y') and x' = i. But since (■ i,j ) <{_ (x’,y'), {x',y') C (■ i,j ) and the faulty trace w is 
(i, j)-P re dictable (Def. 2). 

=>- Let o be the sequence of observations such that (i,j) C P*(o) and let (x,y) be this interval P*(o). 
Notice that y > 1 since (i-j) is not empty. 

Assume y ^ oo. From Lemma 5 and from the definition of the time intervals there exists u\ C n 2 E w 
and u[ C u 2 Qw' such that 

• {re, w'} £ C, 

• obs(ui) = obs(n' 1 ) = o, 

• |obs(u 2 )| — |obs(ui)| = x, 

• (® A) e P, 

• lobsCA)! - |obs(«i)| = y- 1 , 

• (qi A) s Q\F. 

We shall prove by contradiction that w is not (i, j)-predictable. 

Assume that w is (i, j)-predictable. Then there exists a prefix U 3 of w such that P*(rt 3 ) C (i,j). Because 
of the first condition of Definition 1, this prefix must be such that |obs(w 2 )| — |obs(it 3 )| > i, and therefore 
|obs(rti)| — |obs(it 3 )| > 0. We know that P*(obs(zti)) 2 (bj), therefore |obs(ui)| — |obs(u 3 )| > 1 and U 3 C u\. 

Because U\ and u[ generate the same sequence of observations, there exists a prefix u ' 3 of u[ (and therefore 
of u' 2 ) that generates the same sequence of observations as u 3 . Furthermore, we know that |obs(?4)| — 
|obs(u 3 )| = |obs(u 2 )| — |obs(u 3 )| > |obs(u 2 )| — |obs(ui)| = |obs(u 2 )| — |obs(u , 1 )| = y — 1 ; that is: |obs(« 2 )| — 

|obs(u 3 )| > y. According to the second condition of Definition 1 , (q\ A) £ F, which contradicts the last 
item of the six items presented at the beginning of this proof. 

The proof under the assumption that y = 00 is very similar. We choose u ' 2 such that |obs(?4) | — |obs(u' 1 ) | > 
\Q\ + 2. This proves that the system is not (i, \Q\ + l)-predictable. Since we know that a bound bigger than 
|Q| is equivalent to that of 00, we show that the system is not (*, oo)-predictable. □ 

Notice that if dminF^qi) < i, then the system is not (i, j)-predictable for any j (even j = 00). 
Combining Corollary 7 and Lemma 5, we obtain the following theorem. 

Theorem 8 The system is (i, j)-predictable iff dminp(qi) < i and for all sequence o of observations, for all 
pair of states ( 91 , 92 ) Q B(o), 

(■i,j ) (jL distances f(9i) U distances f ( 92 )• 

We write 91 ~ 92 the relation indicating that the two states 91 and 92 appear together in a belief state. 
Notice that ~ is not an equivalence relation (it is not transitive). 



4.3 Algorithms 

We now turn to implementation of Theorem 8. The algorithm includes four steps: 

1. Compute the minimal distance for each state; 

2. Compute the maximal distance for each state; 

3. Compute the twin plant which represents the ~ relation; 

4. Compute the (i, j)-predictability. 

All parts of the verification process will be presented here to ensure the paper is self-contained. 

Algorithm 1 computes the minimal distance of each state. In this algorithm and the following one, 
c(e) = 1 if e is observable and 0 otherwise. It assumes that all states have infinite distance until it is has 
been proved that a shorter distance exists. It then sets all faulty states’ minimal distance to 0 and updates 
the minimal distances of all states until convergence is reached. To make sure that the states are explored 
in the optimal order we use a priority queue Q that orders its elements by smaller value however 

since Q only contains elements with two types of distances (the current distance and this distance plus one), 
the queue can be implemented with two buckets. The complexity of the algorithm is therefore linear in the 
number |T| of transitions. 


Algorithm 1 Computing the minimal distance. 

Input: an FSM (Q, S, T, qj, S 0 ), a set of states F C Q 
Create a table dmiriF : Q — > N U {oo} 

for all q £ Q do 

dmiriF (q) '■= oo 

end for 

Q = 0 

for all q £ F do 

dmin p(q) := 0 
Q:=QU{q} 

end for 

while Q ^ 0 do 

q' := pop(Q) 

for all (q, e, q') £ T do 

if dmiriF(q) > dmiriF^q') + c(e) then 
dmiriF (q) := dmiriF {q') + c(e) 

Q:=QU{q} 
end if 
end for 
end while 
return dmiriF 


Algorithm 2 computes the maximal distance of each state. It starts by computing the list of states ( N ) 
that can stay outside of F forever (those states have infinite maximal distance). It then initialises every 
state with a maximal distance of 0 and updates the distance whenever it finds a bigger value. This update 
will eventually terminate (after at most |Q| iterations). The first part of the algorithm requires to iterate 
twice over all transitions; the second part requires to iterate at most |Q| times over at most all transitions. 
Therefore the complexity of Algorithm 2 is at most \Q\ x \T\. 
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Algorithm 2 Computing the maximal distance. 

Input: an FSM (Q, S, T, qi, S 0 ), a set of states F C Q 
Let N ■- Q \ F 
Create map nsucc : N —> N 
i? := 0 

for all q € N do 

nsuccfg] := \{(q,e,q') £ T}\ 
if nsucc[g] = 0 then 
R := R U q 

end if 
end for 

while R ^ 0 do 

Let q' := pop(-R) 

for all (q,e,q') £ (N x Ex {q'}) do 
nsucc [g] := nsucc [g] — 1 
if nsucc[g] = 0 then 
R := R U g 

end if 
end for 
end while 

Create a table dmaxp : Q —» N U {oo} 

for all g € Q do 
if g € N then 

dmaXF(q) '■= oo 

else 

dmaXF(q) '■= 0 

end if 
end for 

needsUpdate := true 
while needsUpdate do 
needsUpdate := false 
for all g' £ Q \ N do 
for all (g, e, q') £ T do 

if dmaxF^q) < dmaXF^q') + c(e) then 
dmaxF(q) '■= dmaxF(q') + c(e) 
needsUpdate := true 
end if 
end for 
end for 
end while 
return dmaxF 
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The twin plant [JHCK01] is a construction that determines precisely the ~ relation. Notice that, strictly 
speaking, it is not necessary to build it as a finite state machine: for predictability only the ~ relation 
matters; not the transitions between the states of the twin plant. 

Given an FSM A = (Q , £, T, q\ , £ 0 ), the twin plant is the finite state machine (Q T , S T , T t , qf , £ 0 ) where 

• Q T = Q x Q, 

• £ T = ((£\£ 0 ) x {1,2}) U £ 0 , 

. T t = 

{«</!, <? 2 >,e T ,(«» | 

U {((q 1 ,q 2 ),e T ,{q , 1 ,q' 2 ))\ 

U {((qi,q 2 ),e T , {q' 1 ,q' 2 )) \ 

• <?i T = (Qi,Qi)- 

It is well-known that the state (( 71 ,( 72 ) of Q T is reachable from q[ iff qi ~ q 2 (Lennna 10, [GL09] where 
the twin plant is called verifier). The twin plant can therefore be used to verify the (*, j)-predictability. 

The procedure for computing the (*, j (-predictability is given in Algorithm 3. It generates an array p 
such that for all i , the system is (i,p[i])-predictable and non -{i,p\i\ — l)-predictable. The algorithm first 
initialises p[i\ to i. It then iterates over all the states of the twin plant and updates the table p. According 
to Theorem 8 the result of Algorithm 3 is the list of (i,p[i] (-predictabilities that the system enjoys. 


{qi,e T ,q[) GT A (q 2 ,e T ,q 2 ) G T A e T G £„ } 

(<7i, e T , q'i) £ T A q 2 = q ' 2 A e T <E S \ S Q } 

<7i = <7i A (q 2 ,e T ,q 2 )€T A e T € S \ S 0 }. 


Algorithm 3 Algorithm to compute (i, j)-predictability 
l: input an FSM A = ( Q , S, T, q\. S 0 ), the list of minimal and maximal distances dminF and dmaxF , the 
twin plant (Q T , £ T , T t , q[, S Q ) 

2: Create an array of integers p : Nl mm -P’(q'i)I 

3: for all i € {1,..., |Q|} do 
4: p[i] := i 

5: end for 

6: for all (q, q') G Reachable(Q T ) do 
7: i := min (dminpiq), dminF(q')) 

8: j := ma -x.{dmaxp{q), dmax fW)) 

9: p[i\ := max(p[i],j) 

10: end for 
11: return p 


We claim that the algorithm presented here is quadratic in the number of states and transitions of the 
system. It is easy to see that computing the distances is at most quadratic for both types of distances, and 
that the resulting structure has linear size with constant time access. The size of the twin plant is quadratic 
in the size of the original system (it includes at most \Q \ 2 states and (2 x |T| x \Q\) + \T \ 2 transitions)—and 
that is assuming a non-deterministic model. Finally the fourth step requires iterating over the quadratic 
number of states in the twin plant. 

Our definitions of predictability and i-predictability match those of Jeron et al. [JMGL08] and the 
proposed algorithm can therefore be used to verify these properties. It is also possible to simplify it by 
focussing on the i parameter. 

As a last result, consider a fully observable system , i.e., a system in which obs(u) = u. Then, at any time, 
the state of the system can be deduced from the sequence of observations; but notice that how the system 
will evolve remains unknown. Then the relation ~ is equivalent to identity: q ~ q' iff q = q'. Consequently, 
after the distances of each state have been computed, the predictability can be computed in linear time. 
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4.4 Building the Optimal Predictor 

Lemma 4 gives us a procedure for computing the optimal predictor. Similarly to diagnosis and its diagnoser 
[SSL+95] it is possible to compute a deterministic FSM that represents how the belief state evolves as more 
observations are gathered. 

Formally the optimal predictor is a finite state machine (Q*, £*, T*, qj*) where 

• Q* = {<7* I q * C Q}, 

• £* = So, 

• T* C Q* x S 0 * x Q* is defined below, and 

• = {<7i}- 

For every state qi* £ Q* of the optimal predictor and every event e £ S*, there is exactly one state qi * 
such that qi* A qi* is a transition of the optimal predictor. The state qi* C Q is defined as the set of states 
of the system that can be reached from a state of < 71 * through a path that generates only one observation: 

< 72 * = {<72 G Q I Ai € q^. 3u. (51 A q 2 ) A (|obs(«)| = 1)}. 

Given a sequence o of observations the predictor follows the single path labeled by o on the predic¬ 
tor and reaches the state q*{o) (i.e., the state q*(o) such that < 71 * A q*(o)). The prediction is then 
|J(jeg*(o) distances f (q) 3 Adding a single observation e to o, the new prediction can be easily computed 

by getting the state q*(oe) that satisfies q*{o) A q*{oe). Assuming the optimal FSM and the interval associ¬ 
ated with each state of the predictor are precomputed, the optimal prediction of a sequence of observations 
is linear in the size of this sequence and the incremental optimal prediction is constant time. Notice however 
that, as is the case with the diagnoser [Rin07], the optimal predictor is exponentially large in the number of 
states of the system. 

5 Related Work 

Predictability as presented in this paper was introduced by Gene and Lafortune [GL06]. Their approach 
was however only Boolean: they addressed the question “can the fault be predicted before it occurs?” 
They presented an exponential space algorithm using a structure similar to our optimal predictor. They 
also announced the existence of a polytime algorithm, similar to the twin plant used for diagnosability and 
formally presented in an extension of their work [GL09]. 

Together with Jeron and Marchand, they proposed an additional improvement to lower the complexity 
down to quadratic [JMGL08]. We claim here that their algorithm is not quite quadratic (we discuss this 
question at the end of this section). Their approach is very similar to the approach presented in the previous 
section: They construct a twin plant and verify predictability by checking whether there exists a pair q\ ~ <72 
such that dminF^qi) = 0 and dmaxF{q 2 ) = 00 . 

Brandan Briones and Madalinski presented the notions of /^-predictability and w 6 -predictability [BM11]. 
w 6 -predictability is similar to our definition of i-predictability meaning that the fault is predicted at least i 
observations before the fault occurs. / 6 -predictability is the equivalent of our property of (1,_7)-P r edictability, 
meaning that it is possible to predict the fault occurrence before it occurs but when at most j observations 
are still possible before the fault (in other words, the fault prediction is not too early). 

While this is a minor issue, we provide an example and a comprehensive discussion that illustrate the 
complexity error from Jeron et al. [JMGL08]. Consider the example of Figure 3a. This DES includes 
2n + 2 states and 4 n transitions. The single observable event is a and the single unobservable event is t 
(this example does not feature any faulty event). The twin plant then consists in 2 n 2 + 2 states and 4rc 2 

3 If the model is correct, then the state q*{o) = {} should never be reached and the union is therefore well-defined. 


12 





Figure 3: DES (a) and its e-reduction (b). 


Type of states 

Number of states 

Type of transitions 

Number of transition 

(A, A) 

1 

(A, A) —¥ ( Bi,Bj ) 

ri 2 

(Bi , Bj) 

n 2 

(Bi, Bj) —»• (C, C) 

n 2 

(C,C) 

1 

(C,C) -»■ (DuDj) 

n 2 

(Di ) Dj ) 

2 

n 

(Di,Dj) —> ( Di,Dj ) 

n 2 

Total: 

2 n 2 + 2 

Total: 

4n 2 


Table 2: Size of the twin plant for the DES in Figure 3a. 


Type of states 

Number of states 

Type of transitions 

Number of transition 

(A, A) 

1 

(A, A) —> (Bi,Bj) 

ri 2 

(Bi, Bj) 

2 

rr 

(Bi,Bj) —> (Dk,De) 

n 4 

(Di , Dj ) 

2 

n z 

(Di,Dj ) — > (Di,Dj) 

n 2 

Total: 

2(7+1 

Total: 

n 4 + 2 n 2 


Table 3: Size of the twin plant for the e-reduced DES in Figure 3b. 


transitions (details in Table 2). The e-reduction, presented on Figure 3b, contains one state fewer than the 
original DES but n 2 + 2 n transitions. As a consequence, the number of states in the twin plant reduces down 
to 2 n 2 + 1 but the number of transitions shoots up to n 4 + 2 n 2 (details in Table 3). 

6 Conclusion 

We presented a notion of (i, j (-predictability, an extension of predictability that specifies that there exists a 
time interval during which the fault occurrence is bound to happen in the system. This notion is very useful 
because it allows one to express different type of predictability, namely whether a fault can be predicted well 
in advance, whether the time of failure can be precisely predicted, or both. 

There are several obvious extensions to these works, mainly regarding the expressive power of the mod¬ 
elling framework. We want to extend this work to timed systems [CG13], to probabilistic systems [NDY14], 
or to hybrid systems [BTO08]. Other works include the extension of the current work to decentralised pre¬ 
dictors [TK12], the study of optimal observability for predictability akin to that of diagnosability [BLD08] 
or in combinaison with opacity constraints [CMPM14]. 
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